DMM-Pyramid Based Deep Architectures for Action Recognition with Depth Cameras
Authors
Abstract
We propose a method for training deep convolutional neural networks (CNNs) to recognize human actions captured by depth cameras. The depth maps and 3D positions of skeleton joints tracked by depth cameras such as the Kinect sensor open up new possibilities for the recognition task. Current methods mostly build classifiers on complex features computed from the depth data. As deep models, convolutional neural networks usually take the raw inputs (occasionally with simple preprocessing) and produce classification results directly. In this paper, we train both a traditional 2D CNN and a novel 3D CNN for our recognition task. Building on the Depth Motion Map (DMM), we propose the DMM-Pyramid architecture, which partially preserves the temporal ordering information lost in a DMM, to preprocess the depth sequences so that the video inputs can be accepted by both 2D and 3D CNN models. A combination of networks with different depths is used to improve training efficiency, and all convolution operations and parameter updates are based on an efficient GPU implementation. Experimental results on some widely used benchmarks show that the method outperforms state-of-the-art methods.
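The abstract does not spell out how the DMM-Pyramid is built, so the following is only a minimal sketch under stated assumptions. It takes a DMM to be the per-pixel sum of absolute differences between consecutive depth frames (one common definition), and it assumes a temporal pyramid in which level l splits the sequence into 2^l segments with one DMM per segment. The function names `depth_motion_map` and `dmm_pyramid` and the `levels` parameter are hypothetical and not taken from the paper.

```python
def depth_motion_map(frames):
    """Accumulate absolute differences between consecutive depth frames.

    frames: non-empty list of 2D lists (rows of depth values), all the same size.
    Assumed DMM definition; the paper may use a thresholded or projected variant.
    """
    height, width = len(frames[0]), len(frames[0][0])
    dmm = [[0.0] * width for _ in range(height)]
    for prev, curr in zip(frames, frames[1:]):
        for y in range(height):
            for x in range(width):
                dmm[y][x] += abs(curr[y][x] - prev[y][x])
    return dmm


def dmm_pyramid(frames, levels=3):
    """Hypothetical temporal pyramid: level l holds 2**l segment-wise DMMs.

    Assumes len(frames) >= 2 ** (levels - 1) so that no segment is empty.
    """
    n = len(frames)
    pyramid = []
    for level in range(levels):
        parts = 2 ** level
        bounds = [round(i * n / parts) for i in range(parts + 1)]
        # Neighbouring segments share one frame so that every
        # inter-frame difference is counted in exactly one segment.
        pyramid.append([
            depth_motion_map(frames[bounds[i]:min(bounds[i + 1] + 1, n)])
            for i in range(parts)
        ])
    return pyramid
```

In this sketch level 0 is the ordinary DMM of the whole sequence, while the finer levels record in which part of the sequence the motion occurred, which is one way the ordering information could be partially retained.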
Similar resources
Deep Convolutional Neural Networks for Action Recognition Using Depth Map Sequences
Recently, deep learning approaches have achieved promising results in various fields of computer vision. In this paper, a new framework called Hierarchical Depth Motion Maps (HDMM) + 3 Channel Deep Convolutional Neural Networks (3ConvNets) is proposed for human action recognition using depth map sequences. Firstly, we rotate the original depth data in 3D point clouds to mimic the rotation of camera...
The 16th Meeting on Image Recognition and Understanding RGB-D based 3D-Object Recognition by LLC using Depth Spatial Pyramid
Recently, high-accuracy RGB-D cameras capable of providing high-quality three-dimensional information (color and depth) have become commercially available. In this paper, we propose an object recognition method in which 2D object recognition techniques are extended to 3D. Recent image classification systems mainly consist of the following three parts: feature extraction using scaleinvar...
Viewpoint Invariant Action Recognition using RGB-D Videos
In video-based action recognition, viewpoint variations often pose major challenges because the same actions can appear different from different views. We use the complementary RGB and depth information from RGB-D cameras to address this problem. The proposed technique capitalizes on the spatiotemporal information available in the two data streams to extract action features that are lar...
Real-time Line Detection and Line-based Motion Stereo
Recognition of shape is one of the fundamental problems in computer vision. A number of fast line detection and line-based depth recovery algorithms have been developed on various parallel architectures to meet the requirement of real-time robotic vision. This thesis describes a parallel and hierarchical (pyramidal) approach to fast Hough line detection and line-based motion stereo. The lines a...
Naive Bayesian Fusion for Action Recognition from Kinect
The recognition of human actions based on three-dimensional depth data has become a very active research field in computer vision. In this paper, we study the fusion at the feature and decision levels for depth data captured by a Kinect camera to improve action recognition. More precisely, from each depth video sequence, we compute Depth Motion Maps (DMM) from three projection views: front, sid...
Journal title:
Volume, issue
Pages -
Publication date: 2014